Release branch v2.0.0#266

Open
cuonglm wants to merge 113 commits into main from release-branch-v2.0.0

@cuonglm cuonglm commented Oct 9, 2025

Major Release

This release contains new features, improvements and bug fixes.

Added

  • Rule Matching Engine: Implemented new modular rule matching engine infrastructure with support for configurable rule evaluation order (infrastructure ready, not yet exposed to users)
  • Modular CLI Architecture: Split monolithic CLI command structure into focused, maintainable command files
  • Context Support: Added context.Context support throughout configuration methods for better cancellation and timeout handling

Improvements

  • Logging System: Migrated from zerolog to uber zap logging package with improved performance, better structured logging, and enhanced extensibility
  • CLI Architecture: Refactored monolithic commands.go (1,397 lines) into 13 focused command files, improving maintainability and testability
  • DNS Proxy: Major refactoring of DNS proxy implementation with better code organization, improved error handling, and enhanced separation of concerns
  • Client Information System: Enhanced client discovery and information management with improved DHCP, mDNS, ARP, NDP, and hosts file parsing
  • Service Management: Improved service lifecycle management, enhanced reload functionality, and better status reporting across all platforms
  • Network/OS Abstractions: Improved network and OS abstraction layers with reduced code duplication and more consistent behavior across platforms
  • Configuration System: Added context support and improved logging throughout configuration initialization and bootstrap operations
  • Code Organization: Removed ~3,000 lines of deprecated router-specific code, focusing development on core DNS proxy functionality

Fixes

  • Improved error handling and recovery mechanisms throughout the codebase
  • Enhanced service state management and lifecycle handling

Breaking Changes

⚠️ Server and Router-Specific Integrations Removed: All server platforms and router-specific integration code have been removed, including support for:

  • DD-WRT, dnsmasq, EdgeOS, Firewalla, AsusWRT-Merlin, Netgear/Orbi/Voxel, OpenWRT, Synology, Tomato, and Ubiquiti UniFi OS routers
  • Windows Server

If you were using ctrld with any of these router platforms, you will need to use alternative deployment methods. See the migration guide for details.

Note: All other functionality remains backward compatible. Existing configuration files and CLI commands continue to work without changes.

cuonglm added 30 commits October 9, 2025 16:47
This commit reverts the changes from v1.4.5 to v1.4.7, to prepare for
the v2.0.0 branch code.

The changes included in these releases are already part of the v2.0.0
branch.

Details:

Revert "feat: add --rfc1918 flag for explicit LAN client support"

This reverts commit 0e3f764.

Revert "Upgrade quic-go to v0.54.0"

This reverts commit e52402e.

Revert "docs: add known issues documentation for Darwin 15.5 upgrade issue"

This reverts commit 2133f31.

Revert "start mobile library with provision id and custom hostname."

This reverts commit a198a5c.

Revert "Add OPNsense new lease file"

This reverts commit 7af29cf.

Revert ".github/workflows: bump go version to 1.24.x"

This reverts commit ce1a165.

Revert "fix: ensure upstream health checks can handle large DNS responses"

This reverts commit fd48e6d.

Revert "refactor(prog): move network monitoring outside listener loop"

This reverts commit d71d134.

Revert "fix: correct Windows API constants to fix domain join detection"

This reverts commit 21855df.

Revert "refactor: move network monitoring to separate goroutine"

This reverts commit 66e2d3a.

Revert "refactor: extract empty string filtering to reusable function"

This reverts commit 36a7423.

Revert "cmd/cli: ignore empty positional argument for start command"

This reverts commit e616091.

Revert "Avoiding Windows runners file locking issue"

This reverts commit 0948161.

Revert "refactor: split selfUpgradeCheck into version check and upgrade execution"

This reverts commit ce29b5d.

Revert "internal/router: support Ubios 4.3+"

This reverts commit de24fa2.

Revert "internal/router: support Merlin Guest Network Pro VLAN"

This reverts commit 6663925.
Decouple the logging setup required for interactive runs from the setup
required for daemon runs, so that logging for the ctrld binary and the
ctrld packages can be set up more easily.

This is the first step toward replacing the rs/zerolog library with a
different logging library.
Add a logger field to the "prog" struct and use this field inside its
methods instead of always accessing the global mainLog variable. This
ensures more consistent usage of the logger during the ctrld prog
runtime, and also makes the code easier to refactor in the future
(for example, when replacing the logger library).
Make nameserver resolution functions more consistent and accessible:
- Rename currentNameserversFromResolvconf to CurrentNameserversFromResolvconf
- Move function to public API for better reusability
- Update all internal references to use the new public API
- Add comprehensive godoc comments for nameserver functions
- Improve code organization by centralizing DNS resolution logic

This change makes the nameserver resolution functionality more maintainable
and easier to use across different parts of the codebase.
- Add timeouts and proper cleanup in Test_osResolver_Singleflight:
  * Implement context timeout
  * Add proper PacketConn cleanup
  * Fix race conditions in error handling
  * Improve atomic value reporting

- Enhance Test_osResolver_HotCache:
  * Add proper timeout context
  * Implement more reliable cache verification
  * Fix potential resource leaks
  * Add deterministic polling intervals

- Add thread safety to Test_Edns0_CacheReply:
  * Implement proper timeout context
  * Add proper resource cleanup
  * Fix concurrent operations handling

The changes improve overall test suite reliability by addressing resource
management, timeout handling, and thread safety concerns across multiple DNS
resolver test cases.
Move client information related functions from client_info_*.go to desktop_*.go files
to better organize platform-specific code and separate desktop functionality from
shared code.

No functional changes.
Improve documentation for Test_prog_parseResolvConfNameservers to clarify that
the old implementation was removed as part of code deduplication effort. The code
for handling resolv.conf was unified into the resolvconffile package to provide
a consistent interface across the codebase.

This change provides better context for future developers about why the
refactoring was done and what benefits it brings.
Add context parameter to validInterfacesMap for better error handling and
logging. Move Windows-specific network adapter validation logic to the
ctrld package. Key changes include:

- Add context parameter to validInterfacesMap across all platforms
- Move Windows validInterfaces to ctrld.ValidInterfaces
- Improve error handling for virtual interface detection on Linux
- Update all callers to pass appropriate context

This change improves error reporting and makes the interface validation
code more maintainable across different platforms.
Move getDNS type definition from dns.go to os_linux.go where it is used.
Remove the now-empty dns.go file. This change improves code organization
by keeping platform-specific types with their implementations.
Break down the large DNS handling function into smaller, focused functions
with clear responsibilities:

- Extract handleDNSQuery from serveDNS handler function
- Create dedicated startListeners function for listener management
- Add standardQueryRequest struct to encapsulate query parameters
- Split special domain handling into separate function
- Add descriptive comments for each new function
- Improve variable names for better clarity (e.g., startTime vs t)

This refactoring improves code maintainability and readability without
changing the core DNS proxy functionality.
Look for any additional dnsmasq configuration files under /tmp/etc and
handle them like the default one.
This change improves compatibility with newer UniFi OS versions while
maintaining backward compatibility with UniFi OS 4.2 and earlier.
The refactoring also reduces code duplication and improves maintainability
by centralizing dnsmasq configuration path logic.
Split the long proxy method into several smaller methods to improve maintainability
and testability. Each new method has a single responsibility:

- initializeUpstreams: handles upstream configuration setup
- tryCache: manages cache lookup logic
- tryUpstreams: coordinates upstream query attempts
- processUpstream: handles individual upstream query processing
- handleAllUpstreamsFailure: manages failure scenarios
- checkCache: performs cache checks and retrieval
- serveStaleResponse: handles stale cache responses
- shouldContinueWithNextUpstream: determines if failover is needed
- prepareSuccessResponse: formats successful responses

This refactoring:
- Reduces cognitive complexity
- Improves code testability
- Makes the DNS proxy logic flow clearer
- Isolates error handling and edge cases
- Maintains existing functionality

No behavioral changes were made.
Logging there should use Log function to include the request ID if
present. Changes were made unintentionally during the refactoring to
eliminate usage of global logger.

This commit restores the correct (old) behavior.
…tion

- Move version checking logic to shouldUpgrade for testability
- Move upgrade command execution to performUpgrade
- selfUpgradeCheck now composes these two for clarity
- Update and expand tests: focus on logic, not side effects
- Improves maintainability, testability, and separation of concerns
The validation was added in the v1.4.0 release, but it caused the
one-liner install to fail unexpectedly.
- Add filterEmptyStrings utility function for consistent string filtering
- Replace inline slices.DeleteFunc calls with filterEmptyStrings
- Apply filtering to osArgs in addition to command args
- Improves code readability and reduces duplication
- Uses slices.DeleteFunc internally for efficient filtering
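
A minimal sketch of such a helper, assuming it returns a filtered copy and leaves the caller's slice untouched. Only the name filterEmptyStrings and the use of slices.DeleteFunc come from the commit message; the signature and the cloning step are assumptions.

```go
package main

import (
	"fmt"
	"slices"
)

// filterEmptyStrings returns a copy of in with all empty strings removed.
// Assumption: the input is cloned first, because slices.DeleteFunc
// modifies the slice it is given in place.
func filterEmptyStrings(in []string) []string {
	return slices.DeleteFunc(slices.Clone(in), func(s string) bool {
		return s == ""
	})
}

func main() {
	args := []string{"start", "", "--cd", "abc", ""}
	fmt.Println(filterEmptyStrings(args)) // prints [start --cd abc]
}
```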
- Split handleRecovery into focused helper methods for better maintainability:
  * shouldStartRecovery: handles recovery cancellation logic
  * createRecoveryContext: manages recovery context and cleanup
  * prepareForRecovery: removes DNS settings and initializes OS resolver
  * completeRecovery: resets upstream state and reapplies DNS settings
  * reinitializeOSResolver: reinitializes OS resolver with proper logging
  * Update handleRecovery documentation to reflect new orchestration role

- Improve tests:
  * Add newTestProg helper to reduce test setup duplication
  * Write comprehensive table-driven tests for all recovery methods

This refactoring improves code maintainability, testability, and reduces
complexity while maintaining the same recovery behavior. Each method now
has a single responsibility and can be tested independently.
- Add explicit foundDefaultRoute boolean variable to track default route discovery
- Initialize foundDefaultRoute to false and set to true only in success case
- Replace tautological condition `err == nil` with meaningful `foundDefaultRoute` check
- Fixes "tautological condition: nil == nil" linter error

The error occurred because err was being reused from net.Interfaces() call,
making the condition always true. Now we explicitly track whether a default
route was successfully found.
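
The pattern described above can be sketched as follows. This is an illustration only: the route type, the defaultRouteInterface function, and the injected listInterfaces callback are hypothetical stand-ins, not the actual ctrld code. Only the foundDefaultRoute variable name comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// route is a hypothetical, simplified routing table entry.
type route struct {
	iface     string
	isDefault bool
}

// defaultRouteInterface returns the name of the interface that carries the
// default route, or an error if there is none.
func defaultRouteInterface(listInterfaces func() ([]string, error), routes []route) (string, error) {
	ifaces, err := listInterfaces()
	if err != nil {
		return "", err
	}
	// err is known to be nil from here on, so a later `err == nil` check
	// would always be true. Track the outcome of the search explicitly.
	foundDefaultRoute := false
	name := ""
	for _, r := range routes {
		if !r.isDefault {
			continue
		}
		for _, iface := range ifaces {
			if iface == r.iface {
				name = iface
				foundDefaultRoute = true
			}
		}
	}
	if !foundDefaultRoute {
		return "", errors.New("no default route found")
	}
	return name, nil
}

func main() {
	list := func() ([]string, error) { return []string{"lo", "eth0"}, nil }
	name, err := defaultRouteInterface(list, []route{{iface: "eth0", isDefault: true}})
	fmt.Println(name, err)
	_, err = defaultRouteInterface(list, nil)
	fmt.Println(err)
}
```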
Replace github.com/rs/zerolog with go.uber.org/zap throughout the codebase
to improve performance and provide better structured logging capabilities.

Key changes:
- Replace zerolog imports with zap and zapcore
- Implement custom Logger wrapper in log.go to maintain zerolog-like API
- Add LogEvent struct with chained methods (Str, Int, Err, Bool, etc.)
- Update all logging calls to use the new zap-based wrapper
- Replace JSON encoders with Console encoders for better readability

Benefits:
- Better performance with zap's optimized logging
- Consistent structured logging across all components
- Maintained zerolog-like API for easy migration
- Proper field context preservation for debugging
- Multi-core logging architecture for better output control

All tests pass and build succeeds.
- Add condition to skip port 53 attempts when using zero IP address
- Improve error logging by using structured error field instead of string formatting
- Remove redundant error information from log message format

The changes prevent unnecessary port 53 binding attempts when using zero IP
addresses and improve log readability by using zap's structured error fields.
- Add NoticeLevel constant using zapcore.WarnLevel value (1)
- Implement custom level encoders (noticeLevelEncoder, noticeColorLevelEncoder)
- Update Notice() method to use custom level
- Add "notice" case to log level parsing in main.go
- Update encoder configurations to handle NOTICE level properly
- Add comprehensive test (TestNoticeLevel) to verify behavior

The NOTICE level provides visual distinction from INFO and ERROR levels,
with cyan color in development and proper level filtering. When log level
is set to NOTICE, it shows NOTICE and above (WARN, ERROR) while filtering
out DEBUG and INFO messages.

Note: NOTICE and WARN share the same numeric value (1) due to zap's
integer-based level system, so both display as "NOTICE" in logs for
visual consistency.

Usage:
- logger.Notice().Msg("message")
- log_level = "notice" in config
- Supports structured logging with fields
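
To illustrate the level arithmetic without pulling in zap, here is a standard-library-only sketch. The Level type mirrors the numeric values of zapcore.Level (Debug=-1, Info=0, Warn=1, Error=2). The levelName and enabled helpers are hypothetical and are not the encoders added by the commit.

```go
package main

import "fmt"

// Level mirrors the numeric values used by zapcore.Level.
type Level int8

const (
	DebugLevel Level = iota - 1 // -1
	InfoLevel                   // 0
	WarnLevel                   // 1
	ErrorLevel                  // 2

	// NoticeLevel reuses the WARN value, as described in the commit message.
	NoticeLevel = WarnLevel
)

// levelName renders a level for display. Because NOTICE and WARN share the
// value 1, a single case covers both and they print identically.
func levelName(l Level) string {
	switch l {
	case DebugLevel:
		return "DEBUG"
	case InfoLevel:
		return "INFO"
	case NoticeLevel:
		return "NOTICE"
	case ErrorLevel:
		return "ERROR"
	}
	return fmt.Sprintf("LEVEL(%d)", l)
}

// enabled reports whether a message at level l passes a minimum level filter.
func enabled(minLevel, l Level) bool {
	return l >= minLevel
}

func main() {
	for _, l := range []Level{DebugLevel, InfoLevel, NoticeLevel, ErrorLevel} {
		fmt.Printf("%-6s shown=%v\n", levelName(l), enabled(NoticeLevel, l))
	}
}
```

With the filter set to NoticeLevel, DEBUG and INFO are dropped while NOTICE and ERROR pass, which matches the filtering behavior described above.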
Add CommandRunner interface and ServiceManager types to support
dependency injection and better separation of concerns in command handling.
Create separate file for log command handling to improve code organization.
Add LogCommand struct with SendLogs and ViewLogs methods to handle
log-related operations with proper error handling and dependency injection.
Create separate file for service command handling to improve code organization.
Add ServiceCommand struct with Install, Uninstall, Start, Stop, and Status
methods to handle service operations with proper error handling and dependency
injection.
cuonglm added 3 commits March 5, 2026 16:51
Consolidate DoH/DoH3/DoQ transport initialization into a single
SetupTransport method and introduce generic helper functions to eliminate
duplicated IP stack selection logic across transport getters.

This reduces code duplication by ~77 lines while maintaining the same
functionality.
Replace boolean rebootstrap flag with a three-state atomic integer to
prevent concurrent SetupTransport calls during rebootstrap. The atomic
state machine ensures only one goroutine can proceed from "started" to
"in progress", eliminating the need for a mutex while maintaining
thread safety.

States: NotStarted -> Started -> InProgress -> NotStarted

Note that the race condition is still acceptable because any additional
transports created during the race are functional. Once the connection
is established, the unused transports are safely handled by the garbage
collector.
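
A minimal sketch of such a three-state guard, built on atomic.Int32 and CompareAndSwap. The resolver type, the method names, and the setups counter (a stand-in for the real SetupTransport call) are assumptions for illustration. Only the state names follow the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

const (
	rebootstrapNotStarted int32 = iota
	rebootstrapStarted
	rebootstrapInProgress
)

// resolver is a hypothetical holder of the rebootstrap state.
type resolver struct {
	state  atomic.Int32
	setups atomic.Int32 // counts simulated SetupTransport calls
}

// requestRebootstrap moves NotStarted -> Started. It is a no-op if a
// rebootstrap was already requested or is running.
func (r *resolver) requestRebootstrap() {
	r.state.CompareAndSwap(rebootstrapNotStarted, rebootstrapStarted)
}

// maybeRebootstrap lets exactly one caller move Started -> InProgress,
// performs the setup, then resets the state to NotStarted.
func (r *resolver) maybeRebootstrap() bool {
	if !r.state.CompareAndSwap(rebootstrapStarted, rebootstrapInProgress) {
		return false
	}
	r.setups.Add(1) // stand-in for SetupTransport
	r.state.Store(rebootstrapNotStarted)
	return true
}

func main() {
	var r resolver
	r.requestRebootstrap()

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.maybeRebootstrap()
		}()
	}
	wg.Wait()
	fmt.Println(r.setups.Load()) // prints 1
}
```

Because the transition out of Started is a single compare-and-swap, only one of the concurrent callers wins it, and no mutex is needed.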
Implement TCP/TLS connection pooling for DoT resolver to match DoQ
performance. Previously, DoT created a new TCP/TLS connection for every
DNS query, incurring significant TLS handshake overhead. Now connections are
reused across queries, eliminating this overhead for subsequent requests.

The implementation follows the same pattern as DoQ, using parallel dialing
and connection pooling to achieve comparable performance characteristics.
@cuonglm cuonglm force-pushed the release-branch-v2.0.0 branch from e1cbccf to acbc9fd Compare March 5, 2026 10:12
cuonglm and others added 13 commits March 5, 2026 17:24
Remove the transport Close() call from DoH3 error handling path.
The transport is shared and reused across requests, and closing it
on error would break subsequent requests. The transport lifecycle
is already properly managed by the http.Client and the finalizer
set in newDOH3Transport().
Disable warnings from ghw library when retrieving chassis information.
These warnings are undesirable but recoverable errors that emit unnecessary
log messages. Using WithDisableWarnings() suppresses them while maintaining
functionality.
Add DNS suffix matching for non-physical adapters when domain-joined.
This allows interfaces with matching DNS suffix to be considered valid
even if not in validInterfacesMap, improving DNS server discovery for
remote VPN scenarios.
Remove separate watchLinkState function and integrate link state change
handling directly into monitorNetworkChanges. This consolidates network
monitoring logic into a single place and simplifies the codebase.

Update netlink dependency from v1.2.1-beta.2 to v1.3.1 and netns from
v0.0.4 to v0.0.5 to use stable versions.
Add guard checks to prevent panics when processing client info with
empty IP addresses. Replace netip.MustParseAddr with ParseAddr to
handle invalid IP addresses gracefully instead of panicking.

Add test to verify queryFromSelf handles IP addresses safely.
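
The difference between the two parsing calls can be sketched as follows. The isLoopbackIP helper is hypothetical and only illustrates the guard pattern; netip.ParseAddr and Addr.IsLoopback are standard library calls.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isLoopbackIP reports whether s is a loopback address. An empty or
// malformed input yields false instead of a panic, which is what
// netip.MustParseAddr would produce for such input.
func isLoopbackIP(s string) bool {
	if s == "" {
		return false // guard: client info may carry an empty IP
	}
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return false
	}
	return addr.IsLoopback()
}

func main() {
	for _, s := range []string{"", "127.0.0.1", "::1", "192.168.1.10", "not-an-ip"} {
		fmt.Printf("%q loopback=%v\n", s, isLoopbackIP(s))
	}
}
```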
Add connection health check in getConn to validate TLS connections
before reusing them from the pool. This prevents io.EOF errors when
reusing connections that were closed by the server (e.g., due to idle
timeout).
Replace the map-based pool and refCount bookkeeping with a channel-based
pool. Drop the closed state, per-connection address tracking, and
extra mutexes so the pool relies on the channel for concurrency and
lifecycle.
Replace the map-based pool and refCount bookkeeping with a channel-based
pool. Drop the closed state, per-connection address tracking, and extra
mutexes so the pool relies on the channel for concurrency and lifecycle,
matching the approach used in the DoT pool.
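
A channel-based pool of this kind might look like the sketch below. The conn interface, the pool type, and the fakeConn test double are all hypothetical. The real pools presumably hold TLS or QUIC connections and validate them before reuse, which is omitted here.

```go
package main

import "fmt"

// conn is the minimal behavior the pool needs from a connection.
type conn interface {
	Close() error
}

// pool keeps idle connections in a buffered channel. The channel itself
// provides the synchronization, so no mutex or refCount is required.
type pool struct {
	idle chan conn
	dial func() (conn, error)
}

func newPool(size int, dial func() (conn, error)) *pool {
	return &pool{idle: make(chan conn, size), dial: dial}
}

// get returns an idle connection if one is available, otherwise it dials.
func (p *pool) get() (conn, error) {
	select {
	case c := <-p.idle:
		return c, nil
	default:
		return p.dial()
	}
}

// put hands a connection back, closing it if the pool is already full.
func (p *pool) put(c conn) {
	select {
	case p.idle <- c:
	default:
		c.Close()
	}
}

// fakeConn is a test double that records whether Close was called.
type fakeConn struct {
	id     int
	closed bool
}

func (f *fakeConn) Close() error {
	f.closed = true
	return nil
}

func main() {
	dials := 0
	p := newPool(1, func() (conn, error) {
		dials++
		return &fakeConn{id: dials}, nil
	})
	c1, _ := p.get() // pool is empty, so this dials
	p.put(c1)
	c2, _ := p.get() // reuses the idle connection
	fmt.Println(dials, c1 == c2) // prints 1 true
}
```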
Treat "socket missing" (ENOENT) and connection refused as expected when
probing the log server, and only log when the error indicates something
unexpected. This prevents noisy warnings when the log server has not
started yet.

Discovered while doing captive portal tests.
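
One way to classify such probe errors on Unix-like systems is with errors.Is against the syscall error numbers. This is a sketch under assumptions: isExpectedProbeError, dialError, and the samples table are hypothetical names, and the errors are constructed by hand to resemble what net.Dial returns rather than produced by a real probe.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// isExpectedProbeError reports whether err only means that the log server
// is not listening yet (socket file missing or connection refused).
func isExpectedProbeError(err error) bool {
	return errors.Is(err, syscall.ENOENT) || errors.Is(err, syscall.ECONNREFUSED)
}

// dialError builds an error shaped like a failed dial on a Unix socket:
// a *net.OpError wrapping an *os.SyscallError wrapping the errno.
func dialError(errno syscall.Errno) error {
	return &net.OpError{Op: "dial", Net: "unix", Err: os.NewSyscallError("connect", errno)}
}

// samples holds simulated probe failures for demonstration.
var samples = []struct {
	name string
	err  error
}{
	{"socket missing", dialError(syscall.ENOENT)},
	{"connection refused", dialError(syscall.ECONNREFUSED)},
	{"permission denied", dialError(syscall.EACCES)},
}

func main() {
	for _, s := range samples {
		if isExpectedProbeError(s.err) {
			continue // expected while the log server is starting: stay quiet
		}
		fmt.Printf("unexpected probe error (%s): %v\n", s.name, s.err)
	}
}
```

Only the permission error is reported here. The other two are treated as the normal "not started yet" condition.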
macOS Sequoia with Private Wi-Fi Address enabled causes os.Hostname()
to return generic names like "Mac.lan" from DHCP instead of the real
computer name. The /utility provisioning endpoint sends this raw,
resulting in devices named "Mac-lan" in the dashboard.

Fallback chain: ComputerName → LocalHostName → os.Hostname()

LocalHostName can also be affected by DHCP. ComputerName is the
user-set display name from System Settings, fully immune to network state.
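
The fallback chain can be sketched as a first-non-empty selection over ordered sources. The variadic signature below is an assumption, and the ComputerName and LocalHostName lookups are replaced by stand-in functions, since how ctrld queries them is not shown here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// preferredHostname returns the first non-empty name from the given
// sources, which are tried in priority order.
func preferredHostname(sources ...func() string) string {
	for _, src := range sources {
		if name := strings.TrimSpace(src()); name != "" {
			return name
		}
	}
	return ""
}

// osHostname adapts os.Hostname to the source signature, mapping an error
// to an empty result so the chain can fall through.
func osHostname() string {
	name, err := os.Hostname()
	if err != nil {
		return ""
	}
	return name
}

func main() {
	// Stand-ins for the macOS lookups: ComputerName is unavailable here,
	// so the chain falls back to LocalHostName.
	computerName := func() string { return "" }
	localHostName := func() string { return "Alices-MacBook-Pro" }

	fmt.Println(preferredHostname(computerName, localHostName, osHostname))
	// prints Alices-MacBook-Pro
}
```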
Send all available hostname sources (ComputerName, LocalHostName,
HostName, os.Hostname) in the metadata map when provisioning.
This allows the API to detect and repair generic hostnames like
'Mac' by picking the best available source server-side.

Belt and suspenders: preferredHostname() picks the right one
client-side, but metadata gives the API a second chance.
@cuonglm cuonglm force-pushed the release-branch-v2.0.0 branch from acbc9fd to f44169c Compare March 5, 2026 10:24
Codescribe added 7 commits March 10, 2026 16:59
The continue statement only applied to the inner loop, so
loopback/local IPs (e.g. 127.0.0.1) were never filtered.
This caused ctrld to use itself as bootstrap DNS when already
installed as the system resolver — a self-referential loop.

Use the same isLocal flag pattern as getDNSFromScutil() and
getAllDHCPNameservers().
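
The bug and the fix can be reproduced in isolation. Both functions below are simplified, hypothetical versions that work on plain strings. Only the isLocal flag name comes from the commit message.

```go
package main

import "fmt"

// filterBuggy shows the original mistake: continue applies to the inner
// loop, so the append below still runs for local addresses.
func filterBuggy(nameservers, local []string) []string {
	var out []string
	for _, ns := range nameservers {
		for _, l := range local {
			if ns == l {
				continue // only moves on to the next local address
			}
		}
		out = append(out, ns)
	}
	return out
}

// filterFixed records the match in a flag and skips the nameserver in the
// outer loop.
func filterFixed(nameservers, local []string) []string {
	var out []string
	for _, ns := range nameservers {
		isLocal := false
		for _, l := range local {
			if ns == l {
				isLocal = true
				break
			}
		}
		if isLocal {
			continue
		}
		out = append(out, ns)
	}
	return out
}

func main() {
	nameservers := []string{"127.0.0.1", "9.9.9.9"}
	local := []string{"127.0.0.1", "::1"}
	fmt.Println(filterBuggy(nameservers, local)) // prints [127.0.0.1 9.9.9.9]
	fmt.Println(filterFixed(nameservers, local)) // prints [9.9.9.9]
}
```

A labeled continue on the outer loop would also work. The flag is used here because the commit says it matches the existing pattern in the neighboring functions.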
Add platform-specific username detection for Control D metadata:
- macOS: directory services (dscl) with console user fallback
- Linux: systemd loginctl, utmp, /etc/passwd traversal
- Windows: WTS session enumeration, registry, token lookup
@cuonglm cuonglm force-pushed the release-branch-v2.0.0 branch from 606bd8b to 023969f Compare March 10, 2026 10:18
@cuonglm cuonglm force-pushed the release-branch-v2.0.0 branch from de415df to 1fbbb14 Compare March 10, 2026 10:42
This commit adds a new `ctrld log tail` subcommand that streams
runtime debug logs to the terminal in real-time, similar to `tail -f`.

Changes:
- log_writer.go: Add Subscribe/tailLastLines for fan-out to tail clients
- control_server.go: Add /log/tail endpoint with streaming response
  - Internal logging: subscribes to logWriter for live data
  - File-based logging: polls log file for new data (200ms interval)
  - Sends last N lines as initial context on connect
- commands.go: Add `log tail` cobra subcommand with --lines/-n flag
- control_client.go: Add postStream() with no timeout for long-lived connections

Usage:
  sudo ctrld log tail          # shows last 10 lines then follows
  sudo ctrld log tail -n 50    # shows last 50 lines then follows
  Ctrl+C to stop
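
The fan-out to tail clients could be structured roughly as below. This is a sketch, not the code in log_writer.go: the Subscribe signature, the buffer size parameter, and the policy of dropping lines for slow subscribers are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// logWriter is an io.Writer that copies every write to all subscribers.
type logWriter struct {
	mu   sync.Mutex
	subs map[chan []byte]struct{}
}

func newLogWriter() *logWriter {
	return &logWriter{subs: make(map[chan []byte]struct{})}
}

// Subscribe registers a new tail client. It returns a receive-only channel
// of log data and a cancel function that unregisters and closes it.
func (w *logWriter) Subscribe(buf int) (<-chan []byte, func()) {
	ch := make(chan []byte, buf)
	w.mu.Lock()
	w.subs[ch] = struct{}{}
	w.mu.Unlock()
	return ch, func() {
		w.mu.Lock()
		defer w.mu.Unlock()
		if _, ok := w.subs[ch]; ok {
			delete(w.subs, ch)
			close(ch)
		}
	}
}

// Write fans the data out without blocking: if a subscriber's buffer is
// full, the data is dropped for that subscriber (assumed policy).
func (w *logWriter) Write(p []byte) (int, error) {
	line := append([]byte(nil), p...) // copy, since the caller may reuse p
	w.mu.Lock()
	defer w.mu.Unlock()
	for ch := range w.subs {
		select {
		case ch <- line:
		default:
		}
	}
	return len(p), nil
}

func main() {
	w := newLogWriter()
	ch, cancel := w.Subscribe(4)
	fmt.Fprintln(w, "query resolved")
	fmt.Print(string(<-ch)) // prints query resolved
	cancel()
}
```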

2 participants